State sampling dependence of the Hopfield network inference 






Haiping Huang 1,2

1 Key Laboratory of Frontiers in Theoretical Physics, Institute of Theoretical Physics,
Chinese Academy of Sciences, Beijing 100190, China
2 Department of Physics,
The Hong Kong University of Science and Technology, Hong Kong, China
(Dated: January 20, 2013) 



The fully connected Hopfield network is inferred based on observed magnetizations and pairwise 
correlations. We present the system in the glassy phase with low temperature and high memory 
load. We find that the inference error is very sensitive to the form of state sampling. When a 
single state is sampled to compute magnetizations and correlations, the inference error is almost 
indistinguishable irrespective of the sampled state. However, the error can be greatly reduced if the 
data is collected with state transitions. Our result holds for different disorder samples and accounts 
for the previously observed large fluctuations of inference error at low temperatures. 

I. INTRODUCTION



The inverse Ising problem, also known as Boltzmann machine learning, has recently been widely studied in the context of network inference. A large number of interacting elements may yield collective behavior at the network level. Encouragingly, the pairwise Ising model was shown to capture most of the correlation structure in real neuronal networks. The advent of techniques for multi-electrode recording and microarray measurement produces high-throughput biological data. The inverse Ising problem tries to construct a statistical mechanics description of the original system directly from these data, which helps to better understand how the brain or other biological networks represent and process information. On the other hand, to test proposed efficient inverse algorithms, one can alternatively collect the required data, i.e., magnetizations $\{m_i\}$ and two-point connected correlations $\{C_{ij}\}$ ($i, j$ run from 1 to $N$, where $N$ is the number of elements in the network), from Monte Carlo simulations of a toy model. Given the magnetizations and correlations, the underlying parameters (i.e., couplings and fields) of the pairwise Ising model are inferred to describe the statistics of the experimental data. In other words, the data is fitted with $P_{\rm Ising}(\boldsymbol{\sigma}) \propto \exp\left[\sum_{i<j} J_{ij}\sigma_i\sigma_j + \sum_i h_i\sigma_i\right]$ such that the predicted magnetizations and correlations are consistent with those measured, i.e., $\langle\sigma_i\rangle_{\rm Ising} = \langle\sigma_i\rangle_{\rm data}$ and $\langle\sigma_i\sigma_j\rangle_{\rm Ising} = \langle\sigma_i\sigma_j\rangle_{\rm data}$. In this setting, we use $\boldsymbol{\sigma}$ to represent the configuration of the system, and each component takes the value $\pm 1$. Previous studies along this line focused on the Sherrington-Kirkpatrick (SK) model [3] and the Hopfield model. However, the influence of state sampling on network inference was overlooked, and in this work we illustrate this important issue for the reconstruction of the fully connected Hopfield network. We find that the quality of reconstruction depends on the way the data is collected via state sampling. A lazy Glauber dynamics can easily be trapped by a high-lying metastable state; however, in a finite system, it can still make a transition to a different state (free energy valley), provided that the amount of sampling time is chosen appropriately. If we place the system at low enough temperature $T$ and high memory load $\alpha$, these two scenarios for state sampling yield different qualities of network inference. The former maintains a high inference error regardless of which state we sample, while the latter reduces the error substantially.

The paper proceeds as follows. The fully connected Hopfield network is defined in Sec. II. We collect the data using a lazy Glauber dynamics and infer the network with a message passing algorithm, which is also described in that section. Results and discussions are given in Sec. III. We conclude this paper in Sec. IV.



II. FULLY CONNECTED HOPFIELD 
NETWORK AND ITS INFERENCE 

The Hopfield network was proposed in Refs. [9, 10] as an abstraction of biological memory storage and was found to be able to store up to $0.144N$ random unbiased patterns [11]. If the stored patterns are dynamically stable, then the network is able to provide associative memory, and its equilibrium behavior is described by the following Hamiltonian:

$$ H = -\sum_{i<j} J_{ij}\sigma_i\sigma_j, \tag{1} $$

where $\sigma_i = +1$ indicates the spiking of neuron $i$ while $\sigma_i = -1$ means silence. The coupling between neurons $i$ and $j$ is symmetric and constructed according to Hebb's rule:

$$ J_{ij} = \frac{1}{N}\sum_{\mu=1}^{P} \xi_i^{\mu}\xi_j^{\mu}, \tag{2} $$

where $\{\xi_i^{\mu}\}$ are $P$ stored patterns with each element $\xi_i^{\mu}$ taking $\pm 1$ with equal probability. These random stored patterns give rise to disorder, leading to frustration at low temperature. The ratio of the number of stored patterns to the network size $N$ is defined as the memory load $\alpha$, i.e., $\alpha = P/N$.
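The construction of the couplings can be sketched in a few lines. This is a minimal illustration, assuming the usual Hebbian normalization $J_{ij} = \frac{1}{N}\sum_{\mu}\xi_i^{\mu}\xi_j^{\mu}$ with no self-interactions and $\alpha = P/N$; the function name `hebb_couplings`, the small size used here, and the fixed seed are illustrative choices, not taken from the paper.

```python
import random

def hebb_couplings(n, p, rng):
    """Return (J, xi): Hebbian couplings and p random +/-1 patterns of length n."""
    # Each pattern element is +1 or -1 with equal probability.
    xi = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Symmetric coupling, zero diagonal (no self-interaction).
            J[i][j] = J[j][i] = sum(x[i] * x[j] for x in xi) / n
    return J, xi

rng = random.Random(0)               # fixed seed, only for reproducibility
J, xi = hebb_couplings(20, 4, rng)   # memory load alpha = 4 / 20 = 0.2
```

With the parameters studied in the paper ($N = 125$, $\alpha = 0.2$) the same call would read `hebb_couplings(125, 25, rng)`.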

Our prime concern is the inference of the fully connected Hopfield network. In this network, each neuron is connected to all the other neurons, and no self-interactions or external fields are assumed. The equilibrium properties of the fully connected Hopfield model have been addressed in Ref. [11]. In this work, we focus on the glassy phase, which takes place when $T < T_g = 1 + \sqrt{\alpha}$. At all finite $\alpha$, this phase has a vanishingly small overlap with any of the stored patterns. Furthermore, the replica symmetric solution for this phase is unstable, and the phase space develops a hierarchically organized structure, which leads to anomalously slow dynamic relaxation. The dynamics was shown to exhibit aging phenomena, supporting the nontrivial structure of the phase space. Therefore, starting from different initial configurations, the lazy Glauber dynamics will be trapped in different free energy valleys with high probability. For a finite system, free energy barriers around metastable states are always finite, and the Glauber dynamics can escape from local minima of the free energy landscape. As an inverse problem, we are therefore interested in the influence of state sampling on network inference in this phase. We look at individual disorder samples with $N = 125$ and $\alpha = 0.2$ at the low temperature $T = 0.5$, and expect that the analysis of these individual disorder samples will provide valuable information on the state sampling dependence of network inference in a general context.
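As a quick numerical check of the chosen parameters, the following worked lines assume the standard spin-glass transition line of the Hopfield model, $T_g = 1 + \sqrt{\alpha}$, together with $\alpha = P/N$. Both relations are stated here as assumptions; the numbers $N = 125$, $\alpha = 0.2$ and $T = 0.5$ are those quoted in the text.

```latex
\begin{align}
P   &= \alpha N = 0.2 \times 125 = 25, \\
T_g &= 1 + \sqrt{\alpha} = 1 + \sqrt{0.2} \approx 1.447, \\
T / T_g &\approx 0.5 / 1.447 \approx 0.35 .
\end{align}
```

Under these assumptions the simulated temperature lies well below the glass transition, and the memory load exceeds the storage capacity $0.144$ quoted earlier, so no retrieval of the stored patterns is expected.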

To sample the states of the original model Eq. (1), we apply a lazy (non-optimized) Glauber dynamics rule:

$$ P(\sigma_i \to -\sigma_i) = \frac{1}{2}\left[1 - \sigma_i\tanh(\beta h_i)\right], \tag{3} $$

where $\beta$ is the inverse temperature and $h_i = \sum_{j\neq i} J_{ij}\sigma_j$ is the local field that neuron $i$ feels. In practice, we first randomly generate a configuration, which is then updated by the local dynamics rule Eq. (3) in a random asynchronous fashion. In this setting, we define a Glauber dynamics step as $N$ proposed flips. As a lazy dynamics, we quench the system directly to the preset low temperature $T = 0.5$ without any annealing scheme and run $4\times10^{6}$ steps in total, among which the first $2\times10^{6}$ steps are run for thermal equilibration and the other $2\times10^{6}$ steps for computing magnetizations and correlations, i.e., $m_i = \langle\sigma_i\rangle_{\rm data}$ and $C_{ij} = \langle\sigma_i\sigma_j\rangle_{\rm data} - m_i m_j$, where $\langle\cdots\rangle_{\rm data}$ denotes the average over the collected data. The state of the network is sampled every 100 steps after thermal equilibration.
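The sampling protocol described above can be sketched as follows. This is a minimal, unoptimized illustration assuming the flip probability $\frac{1}{2}[1 - \sigma_i\tanh(\beta h_i)]$ with $h_i = \sum_{j\neq i}J_{ij}\sigma_j$; the function name `glauber_sample`, its parameters, and the tiny three-spin coupling matrix are hypothetical and chosen only so that the example runs quickly.

```python
import math
import random

def glauber_sample(J, beta, n_equil, n_meas, stride, rng):
    """Lazy Glauber dynamics; returns (m, C, samples)."""
    n = len(J)
    sigma = [rng.choice((-1, 1)) for _ in range(n)]   # random initial configuration
    samples = []
    for step in range(n_equil + n_meas):
        for _ in range(n):                            # one step = N proposed flips
            i = rng.randrange(n)                      # random asynchronous update
            h = sum(J[i][j] * sigma[j] for j in range(n) if j != i)
            if rng.random() < 0.5 * (1.0 - sigma[i] * math.tanh(beta * h)):
                sigma[i] = -sigma[i]
        # After equilibration, record the configuration every `stride` steps.
        if step >= n_equil and (step - n_equil) % stride == 0:
            samples.append(sigma[:])
    k = len(samples)
    m = [sum(s[i] for s in samples) / k for i in range(n)]
    C = [[sum(s[i] * s[j] for s in samples) / k - m[i] * m[j]
          for j in range(n)] for i in range(n)]
    return m, C, samples

# Toy three-spin ferromagnet (illustrative only, not the Hopfield sample of the paper).
J = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
m, C, samples = glauber_sample(J, beta=1.0, n_equil=200, n_meas=2000,
                               stride=10, rng=random.Random(1))
```

The settings quoted in the text would correspond to `n_equil=2 * 10**6`, `n_meas=2 * 10**6` and `stride=100`, which yields $2\times10^{4}$ recorded configurations; a pure-Python loop of that length is slow, so a vectorized or compiled implementation would be preferable in practice.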

Given the measured magnetizations and correlations, we attempt to infer the couplings via the susceptibility propagation (SusProp) update rule [3], which was shown to outperform other mean-field-type methods. Before introducing this rule, we define two kinds of relevant messages. One is the cavity magnetization $m_{i\to j}$ of neuron $i$ in the absence of neuron $j$; the other is the cavity susceptibility $g_{i\to j,k}$, which is the response of the cavity field $h_{i\to j}$ to a small change of the local field of neuron $k$. The SusProp rule can be derived using belief propagation plus the fluctuation-response relation [7] and is formulated as follows:

$$ m_{i\to j} = \frac{m_i - m_{j\to i}\tanh J_{ij}}{1 - m_i m_{j\to i}\tanh J_{ij}}, \tag{4a} $$

$$ g_{i\to j,k} = \delta_{ik} + \sum_{n\in\partial i\backslash j} \frac{1 - m_{n\to i}^{2}}{1 - (m_{n\to i}\tanh J_{ni})^{2}}\,\tanh J_{ni}\; g_{n\to i,k}, \tag{4b} $$

$$ J_{ij}^{\rm new} = \frac{\epsilon}{2}\log\left[\frac{(1 + \tilde{C}_{ij})(1 - m_{i\to j}m_{j\to i})}{(1 - \tilde{C}_{ij})(1 + m_{i\to j}m_{j\to i})}\right] + (1 - \epsilon)J_{ij}^{\rm old}, \tag{4c} $$

$$ \tilde{C}_{ij} = \frac{C_{ij} - (1 - m_i^{2})\,g_{i\to j,j}}{g_{j\to i,j}} + m_i m_j, \tag{4d} $$

where $\partial i\backslash j$ denotes the neighbors of neuron $i$ except $j$, $\delta_{ik}$ is the Kronecker delta, and $\epsilon \in [0, 1]$ is introduced as a damping factor that should be chosen appropriately to prevent the absolute value of the updated $\tanh(J_{ij})$ from being larger than 1. Note that all couplings in Eq. (4) have been scaled by the inverse temperature $\beta$.
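One sweep of the update rule can be sketched as below. This is an illustrative reading of Eqs. (4a)-(4d) on a fully connected graph, not the author's implementation: the function name `susprop_sweep`, the synchronous update schedule, and the absence of any guard against $|\tilde{C}_{ij}| \ge 1$ are assumptions made for brevity. The two-spin check at the end uses $\epsilon = 1$ (no damping) only because a single isolated pair with zero fields is solved exactly in one sweep.

```python
import math

def susprop_sweep(m, C, J, mc, g, eps):
    """One synchronous sweep of Eqs. (4a)-(4d).

    m[i], C[i][j]: measured magnetizations and connected correlations.
    J[i][j]: current couplings (scaled by beta).
    mc[i][j]: cavity magnetization m_{i->j}; g[i][j][k]: cavity susceptibility g_{i->j,k}.
    """
    n = len(m)
    t = [[math.tanh(J[i][j]) for j in range(n)] for i in range(n)]
    new_mc = [[0.0] * n for _ in range(n)]
    new_g = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Eq. (4a): cavity magnetization from the measured m_i.
            new_mc[i][j] = (m[i] - mc[j][i] * t[i][j]) / (1.0 - m[i] * mc[j][i] * t[i][j])
            # Eq. (4b): cavity susceptibility, summing over neighbors a of i except j.
            for k in range(n):
                s = 1.0 if i == k else 0.0
                for a in range(n):
                    if a == i or a == j:
                        continue
                    w = (1.0 - mc[a][i] ** 2) / (1.0 - (mc[a][i] * t[a][i]) ** 2)
                    s += w * t[a][i] * g[a][i][k]
                new_g[i][j][k] = s
    new_J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Eq. (4d); math.log below raises if |c_tilde| >= 1 (no guard in this sketch).
            c_tilde = (C[i][j] - (1.0 - m[i] ** 2) * new_g[i][j][j]) / new_g[j][i][j] + m[i] * m[j]
            mm = new_mc[i][j] * new_mc[j][i]
            # Eq. (4c): damped coupling update.
            target = 0.5 * math.log((1.0 + c_tilde) * (1.0 - mm) / ((1.0 - c_tilde) * (1.0 + mm)))
            new_J[i][j] = new_J[j][i] = eps * target + (1.0 - eps) * J[i][j]
    return new_J, new_mc, new_g

# Two-spin sanity check: zero fields, true coupling 0.7, so C_01 = tanh(0.7).
m = [0.0, 0.0]
c01 = math.tanh(0.7)
C = [[1.0, c01], [c01, 1.0]]
J = [[0.0, 0.0], [0.0, 0.0]]
mc = [[0.0, 0.0], [0.0, 0.0]]
g = [[[1.0 if i == k else 0.0 for k in range(2)] for j in range(2)] for i in range(2)]
J, mc, g = susprop_sweep(m, C, J, mc, g, eps=1.0)
```

For larger networks the sweep would be repeated with a small damping factor until the couplings stop changing; the innermost sum makes this naive version scale as $N^4$ per sweep, which is acceptable only for small illustrative sizes.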

To evaluate the reconstruction performance of SusProp, we define the inference error as

$$ \Delta = \left[\frac{2}{N(N-1)}\sum_{i<j}\left(J_{ij}^{*} - J_{ij}^{\rm true}\right)^{2}\right]^{1/2}, $$

where $J_{ij}^{*}$ is the inferred coupling while $J_{ij}^{\rm true}$ is the true one constructed according to Eq. (2). In Eq. (4), $(\{m_i\}, \{C_{ij}\})$ serve as inputs to the update rule. To run SusProp, we initially set all couplings to zero and, for every directed edge, randomly initialize the message $m_{i\to j} \in [-1.0, 1.0]$ and set $g_{i\to j,k} = 0$ if $i \neq k$ and 1.0 otherwise. Then SusProp is iterated according to Eq. (4) until either all inferred couplings converge within a preset precision $\eta$ or the maximal number of iterations $T_{\max}$ is reached. In practice, we set $T_{\max} = 3000$ and $\eta = 10^{-4}$, and $\epsilon$ varies from $O(10^{-2})$ to $O(10^{-4})$.
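The inference error is a root-mean-square deviation over all distinct pairs. A minimal sketch, assuming the normalization by the number of pairs $N(N-1)/2$; the function name `inference_error` and the toy matrices are illustrative.

```python
import math

def inference_error(J_inferred, J_true):
    """Root-mean-square difference between inferred and true couplings over pairs i < j."""
    n = len(J_true)
    sq = sum((J_inferred[i][j] - J_true[i][j]) ** 2
             for i in range(n) for j in range(i + 1, n))
    return math.sqrt(2.0 * sq / (n * (n - 1)))

# Toy check: every inferred coupling is off by 0.1, so the error is 0.1.
J_true = [[0.0] * 4 for _ in range(4)]
J_inferred = [[0.0 if i == j else 0.1 for j in range(4)] for i in range(4)]
delta = inference_error(J_inferred, J_true)
```

Because the couplings in the update rule are scaled by $\beta$, both matrices should be expressed in the same units before they are compared.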



III. RESULTS AND DISCUSSIONS 

We simulate the fully connected Hopfield network of size $N = 125$ at $T = 0.5$ and $\alpha = 0.2$, forcing the system to enter the glassy phase. The state sampling dependence of the network inference is illustrated in Fig. 1.

FIG. 1: (Color online) State sampling dependence of the network inference. We present the inference error for two distinct disorder samples. For the second sample, the first two state samplings give large inference errors while the last three samplings reduce the error substantially. Insets give their corresponding evolutions of Hamming distances between the current sampled configuration and the first sampled one. For the first sample, all five state samplings provide nearly the same inference errors and their evolutions of the Hamming distances are similar to that of the second state sampling of the second sample. Note that the state index for the last three samplings of the second sample means different paths of multiple states sampling and that for the others means different states where the sampling is confined.

To discriminate the two scenarios for state sampling, we track the evolution of the Hamming distance between the current sampled configuration and the first sampled one. By Hamming distance, we mean $H_d = \frac{1}{2}\left(1 - \frac{1}{N}\sum_{i=1}^{N}\sigma_i^{t}\sigma_i^{0}\right)$, where $\boldsymbol{\sigma}^{t}$ is the current sampled configuration while $\boldsymbol{\sigma}^{0}$ is the first sampled one. We also measure the energy of each sampled configuration during the whole state sampling process. For the lower inset of Fig. 1, the energy fluctuates around $-0.511$ with fluctuations of order 0.023, while for the upper inset it fluctuates around $-0.510$ with nearly the same fluctuation amplitude. It can be seen clearly that
the inference error depends strongly on the way the states are sampled, regardless of which state the lazy dynamics visits. In the first type, the sampling is confined to a single free energy valley, or to the same level of the family-tree-like structure of the phase space. This case is likely to occur when the limited amount of sampling time is not enough for the dynamics to escape from the current valley. Therefore, we observe a single mean value of the Hamming distance in the upper inset. Unfortunately, this type produces highly magnetized data, especially at low temperature, which gives rise to the non-convergence of SusProp and a high inference error. It should be emphasized that all samplings with a similar Hamming distance evolution, as the upper inset shows, exhibit nearly the same inference error irrespective of the disorder sample and the sampled state. In the second type, a transition to a different free energy valley or to a higher level of the phase space organization may happen due to the finiteness of the network if the temperature is not very low. We do observe this possibility in our simulations, as the lower inset shows. In this case, another, larger mean Hamming distance appears during the state sampling. Since each free energy valley is visited by the Glauber dynamics with a probability proportional to its thermodynamic weight [16], when state transitions occur, the average of $\sigma_i\sigma_j$ or $\sigma_i$ computed over all $2\times10^{4}$ sampled configurations amounts to a weighted sum, i.e., $\langle\sigma_i\sigma_j\rangle_{\rm data} = \sum_{\gamma} w_{\gamma}\langle\sigma_i\sigma_j\rangle_{\gamma}$ and $\langle\sigma_i\rangle_{\rm data} = \sum_{\gamma} w_{\gamma}\langle\sigma_i\rangle_{\gamma}$, where only a few states are considered, depending on the actual state transitions in the sampling process. Here $\gamma$ is the index of a state the dynamics visits and $w_{\gamma}$ is the associated thermodynamic weight, which is proportional to the exponential of minus its free energy scaled by $\beta$ [12, 17]. That is to say, we now have access to the correlations as well as the magnetizations in the form of a weighted sum. This weighted sum attenuates the high polarization of the supplied data, and a relatively low inference error is achieved. In fact, SusProp converges in this case. For both types of state sampling, the result holds for other random samples, which accounts for the previously observed large fluctuations of the inference error at low temperatures.
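Two ingredients of the argument above can be made concrete with a short sketch: the Hamming distance used to detect state transitions, and the way a weighted sum over states attenuates strongly polarized magnetizations. The normalized form of the Hamming distance (divided by $N$), the function names, and the two hypothetical valleys with equal weights are assumptions made for illustration; they are not data from the simulations.

```python
def hamming_distance(sigma_t, sigma_0):
    """Normalized Hamming distance (1/2)(1 - overlap) between two +/-1 configurations."""
    n = len(sigma_0)
    overlap = sum(a * b for a, b in zip(sigma_t, sigma_0)) / n
    return 0.5 * (1.0 - overlap)

def mix_states(weights, state_mags):
    """Weighted sum of per-state magnetizations: m_i = sum_gamma w_gamma * m_i^gamma."""
    n = len(state_mags[0])
    return [sum(w * mg[i] for w, mg in zip(weights, state_mags)) for i in range(n)]

# Hypothetical valleys: each one alone gives strongly polarized magnetizations.
valley_a = [0.9, -0.9, 0.9, 0.9]
valley_b = [0.9, 0.9, -0.9, 0.9]
mixed = mix_states([0.5, 0.5], [valley_a, valley_b])

# Distance between a configuration and one with half of its spins flipped.
d = hamming_distance([1, 1, -1, -1], [1, 1, 1, 1])
```

In this toy case the neurons whose polarization differs between the two valleys end up with a mixed magnetization close to zero, while neurons polarized identically in both valleys are unaffected, which illustrates how sampling across valleys can reduce the overall polarization of the input data.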

A previous study emphasized that the inference error can be drastically reduced by increasing the number of independent observations, which is consistent with our results in the sense that state transitions occur with a higher probability if the sampling time increases. Importantly, our work further shows that for moderate sampling times, sampling with state transitions can reduce the inference error, while sampling without state transitions maintains a high inference error.



IV. CONCLUSION 



In conclusion, our study implies that, to lower the inference error, one should select the most efficient way to sample the system, particularly when the phase space of the original model develops a hierarchically organized structure and the amount of sampling time is limited (e.g., $2\times10^{4}$ sampled configurations in our current simulations). Sampling with state transitions seems to be the most effective way to infer the finite-size network structure. In real neuronal networks, such as a retinal network presented with natural movie stimuli, the coexistence of negative and positive couplings can lead to frustration and thus to the emergence of many metastable states [3, 19]. For instance, a recording from a salamander retina of 40 neurons showed that several metastable states appear reproducibly across multiple presentations of the same movie. Our result for the state sampling dependence of the Hopfield network inference may have some implications for the inference of real neuronal networks.